Phase retrieval and audio signal reconstruction with non-quadratic cost functions (Reconstruction de phase et de signaux audio avec des fonctions de coût non-quadratiques)
Audio signal reconstruction consists of recovering sound signals from incomplete or degraded representations. This problem can be cast as an inverse problem, a class of problems frequently tackled with optimization or machine learning strategies. In this thesis, we propose to change the cost function in inverse problems related to audio signal reconstruction. We mainly address the phase retrieval problem, which is common when manipulating audio spectrograms. A first line of work tackles the optimization of non-quadratic cost functions for phase retrieval. We study this problem in two contexts: audio signal reconstruction from a single spectrogram and source separation. We introduce a novel formulation of the problem with Bregman divergences, as well as algorithms for solving it. A second line of work proposes to learn the cost function from a given dataset. This is done within the framework of unfolded neural networks, which are derived from iterative algorithms. We introduce a neural network based on unfolding the Alternating Direction Method of Multipliers (ADMM) that includes learnable activation functions. We make explicit the relation between learning its parameters and learning the cost function for phase retrieval. We conduct numerical experiments for each of the proposed methods to evaluate their performance and their potential for audio signal reconstruction.
Learning the Proximity Operator in Unfolded ADMM for Phase Retrieval
This paper considers the phase retrieval (PR) problem, which aims to
reconstruct a signal from phaseless measurements such as magnitude or power
spectrograms. PR is generally handled as a minimization problem involving a
quadratic loss. Recent works have considered alternative discrepancy measures,
such as the Bregman divergences, but it is still challenging to tailor the
optimal loss for a given setting. In this paper we propose a novel strategy to
automatically learn the optimal metric for PR. We unfold a recently introduced
ADMM algorithm into a neural network, and we emphasize that the information
about the loss used to formulate the PR problem is conveyed by the proximity
operator involved in the ADMM updates. Therefore, we replace this proximity
operator with trainable activation functions: learning these in a supervised
setting is then equivalent to learning an optimal metric for PR. Experiments
conducted with speech signals show that our approach outperforms the baseline
ADMM, using a lightweight and interpretable neural architecture. Comment: 10 pages, 5 figures, submitted to IEEE SP
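As a hedged illustration of the statement that the loss enters ADMM through the proximity operator, the sketch below evaluates that operator for one particular choice, the quadratic loss on magnitudes, f(z) = 0.5 (|z| - r)^2. The closed form used here is a standard result and is an assumption of this example, not something taken from the paper; the function name `prox_quadratic_magnitude` and the parameters `r` (measured magnitudes) and `rho` (ADMM penalty parameter) are hypothetical.

```python
import numpy as np


def prox_quadratic_magnitude(v, r, rho):
    """Entrywise minimizer over complex z of
        0.5 * (|z| - r)**2 + 0.5 * rho * |z - v|**2.

    The minimizer keeps the phase of v and moves the magnitude of v
    towards the measured magnitude r (assumed closed form, see lead-in).
    """
    v = np.asarray(v, dtype=complex)
    r = np.asarray(r, dtype=float)
    mag = np.abs(v)
    # Unit-modulus phase of v; where v == 0 any phase is optimal, so use 1.
    safe_mag = np.where(mag > 0, mag, 1.0)
    phase = np.where(mag > 0, v / safe_mag, 1.0 + 0.0j)
    # Weighted average of the measured magnitude and the current magnitude.
    new_mag = (r + rho * mag) / (1.0 + rho)
    return new_mag * phase


v = np.array([3 + 4j, 0j])
r = np.array([1.0, 2.0])
z = prox_quadratic_magnitude(v, r, rho=1.0)
print(z)
```

In the unfolded network described in the abstract, a fixed map of this kind is what gets replaced by trainable activation functions, so that training the activations amounts to learning the loss.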
Sub-terahertz, microwaves and high energy emissions during the December 6, 2006 flare, at 18:40 UT
The presence of a solar burst spectral component with flux density increasing
with frequency in the sub-terahertz range, spectrally separated from the
well-known microwave spectral component, brings new possibilities to explore the
physical processes of flares, both observationally and theoretically. The solar event
of 6 December 2006, starting at about 18:30 UT, exhibited a particularly
well-defined double spectral structure, with the sub-THz spectral component
detected at 212 and 405 GHz by SST and microwaves (1-18 GHz) observed by the
Owens Valley Solar Array (OVSA). Emissions recorded by satellite
instruments are discussed, with emphasis on ultraviolet (UV) emission observed by the
Transition Region And Coronal Explorer (TRACE), soft X-rays from the
Geostationary Operational Environmental Satellites (GOES), and X- and gamma-rays
from the Ramaty High Energy Solar Spectroscopic Imager (RHESSI). The sub-THz
impulsive component had its closest temporal counterpart only in the
higher-energy X- and gamma-ray ranges. The spatial positions of the centers of
emission at 212 GHz for the first flux enhancement were clearly displaced by
more than one arc-minute from the positions at the following phases. The observed
sub-THz fluxes and burst source plasma parameters were found to be difficult to
reconcile with a purely thermal emission component. We discuss possible
mechanisms to explain the double spectral components at microwaves and in the
THz range. Comment: Accepted version for publication in Solar Physic
The Changing Landscape for Stroke Prevention in AF: Findings From the GLORIA-AF Registry Phase 2
Background GLORIA-AF (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients with Atrial Fibrillation) is a prospective, global registry program describing antithrombotic treatment patterns in patients with newly diagnosed nonvalvular atrial fibrillation at risk of stroke. Phase 2 began when dabigatran, the first non–vitamin K antagonist oral anticoagulant (NOAC), became available. Objectives This study sought to describe phase 2 baseline data and compare these with the pre-NOAC era collected during phase 1. Methods During phase 2, 15,641 consenting patients were enrolled (November 2011 to December 2014); 15,092 were eligible. This pre-specified cross-sectional analysis describes eligible patients' baseline characteristics. Atrial fibrillation disease characteristics, medical outcomes, and concomitant diseases and medications were collected. Data were analyzed using descriptive statistics. Results Of the total patients, 45.5% were female; median age was 71 (interquartile range: 64, 78) years. Patients were from Europe (47.1%), North America (22.5%), Asia (20.3%), Latin America (6.0%), and the Middle East/Africa (4.0%). Most had high stroke risk (CHA2DS2-VASc [Congestive heart failure, Hypertension, Age ≥75 years, Diabetes mellitus, previous Stroke, Vascular disease, Age 65 to 74 years, Sex category] score ≥2; 86.1%); 13.9% had moderate risk (CHA2DS2-VASc = 1). Overall, 79.9% received oral anticoagulants, of whom 47.6% received NOAC and 32.3% vitamin K antagonists (VKA); 12.1% received antiplatelet agents; 7.8% received no antithrombotic treatment. For comparison, the proportion of phase 1 patients (of N = 1,063 all eligible) prescribed VKA was 32.8%, acetylsalicylic acid 41.7%, and no therapy 20.2%. In Europe in phase 2, treatment with NOAC was more common than VKA (52.3% and 37.8%, respectively); 6.0% of patients received antiplatelet treatment; and 3.8% received no antithrombotic treatment. 
In North America, 52.1%, 26.2%, and 14.0% of patients received NOAC, VKA, and antiplatelet drugs, respectively; 7.5% received no antithrombotic treatment. NOAC use was less common in Asia (27.7%), where 27.5% of patients received VKA, 25.0% antiplatelet drugs, and 19.8% no antithrombotic treatment. Conclusions The baseline data from GLORIA-AF phase 2 demonstrate that in newly diagnosed nonvalvular atrial fibrillation patients, NOAC have been highly adopted into practice, becoming more frequently prescribed than VKA in Europe and North America. Worldwide, however, a large proportion of patients remain undertreated, particularly in Asia and North America. (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients With Atrial Fibrillation [GLORIA-AF]; NCT01468701
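To make the risk stratification used in this abstract concrete, here is a minimal sketch of the CHA2DS2-VASc score built from the components the abstract lists. The point weights (2 for age 75 or older and for previous stroke, 1 for every other item, with the sex-category point given to female patients) follow the standard definition of the score and are not stated in the abstract itself; the function and parameter names are hypothetical.

```python
def cha2ds2_vasc(age, female, chf=False, hypertension=False, diabetes=False,
                 prior_stroke=False, vascular_disease=False):
    """Return the CHA2DS2-VASc score (0 to 9) under the assumed standard weights."""
    score = 0
    score += 1 if chf else 0               # C: congestive heart failure
    score += 1 if hypertension else 0      # H: hypertension
    if age >= 75:                          # A2: age 75 or older counts double
        score += 2
    elif age >= 65:                        # A: age 65 to 74
        score += 1
    score += 1 if diabetes else 0          # D: diabetes mellitus
    score += 2 if prior_stroke else 0      # S2: previous stroke counts double
    score += 1 if vascular_disease else 0  # V: vascular disease
    score += 1 if female else 0            # Sc: sex category
    return score


def is_high_risk(score):
    """The abstract labels a score of 2 or more as high stroke risk."""
    return score >= 2


# Example: a 71-year-old woman with hypertension and no other risk factors.
example = cha2ds2_vasc(age=71, female=True, hypertension=True)
print(example, is_high_risk(example))
```

The example patient scores one point each for age 65 to 74, sex category, and hypertension, which places her in the high-risk group that made up 86.1% of the phase 2 cohort.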
Phase retrieval with Bregman divergences and application to audio signal recovery
23 pages, 4 figures, submitted to the IEEE Journal of Selected Topics in Signal Processing. Phase retrieval (PR) aims to recover a signal from the magnitudes of a set of inner products. This problem arises in many audio signal processing applications that operate on a short-time Fourier transform magnitude or power spectrogram and discard the phase information. Recovering the missing phase from the resulting modified spectrogram is necessary in order to synthesize time-domain signals. PR is commonly addressed by considering a minimization problem involving a quadratic loss function. In this paper, we adopt a different standpoint. Indeed, the quadratic loss does not properly account for some perceptual properties of audio, and alternative discrepancy measures such as beta-divergences have been preferred in many settings. Therefore, we formulate PR as a new minimization problem involving Bregman divergences. We consider a general formulation that actually addresses two problems, since it accounts for the non-symmetry of these divergences in general. To optimize the resulting objective, we derive two algorithms based on accelerated gradient descent and the alternating direction method of multipliers (ADMM). Experiments conducted on audio signal recovery from either exact or modified spectrograms highlight the potential of our proposed methods for audio restoration. In particular, some of these Bregman divergences yield better performance than the quadratic loss when performing PR from highly degraded spectrograms.
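The non-symmetry mentioned in this abstract can be seen on a small example. The sketch below implements the beta-divergence family, which the abstract cites as an alternative to the quadratic loss. The formulas are the standard definitions (beta = 2 gives half the squared Euclidean distance, beta = 1 the generalized Kullback-Leibler divergence, beta = 0 the Itakura-Saito divergence) and are an assumption of this example, not taken from the paper; the function name `beta_divergence` is hypothetical, and inputs are assumed strictly positive.

```python
import numpy as np


def beta_divergence(x, y, beta):
    """Sum over entries of the beta-divergence d_beta(x | y) for positive x, y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if beta == 1:
        # Generalized Kullback-Leibler divergence (limit case beta -> 1).
        return float(np.sum(x * np.log(x / y) - x + y))
    if beta == 0:
        # Itakura-Saito divergence (limit case beta -> 0).
        return float(np.sum(x / y - np.log(x / y) - 1.0))
    num = x**beta + (beta - 1.0) * y**beta - beta * x * y ** (beta - 1.0)
    return float(np.sum(num) / (beta * (beta - 1.0)))


x = np.array([1.0, 2.0, 4.0])  # e.g. observed magnitudes
y = np.array([2.0, 2.0, 1.0])  # e.g. magnitudes of the current estimate

for beta in (2, 1, 0):
    # Compare d(x | y) with d(y | x): equal only for the quadratic case.
    print(beta, beta_divergence(x, y, beta), beta_divergence(y, x, beta))
```

Only the quadratic case (beta = 2) returns the same value in both directions, which is why a formulation based on these divergences has to say which argument holds the observations and which holds the estimate.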
Phase recovery with Bregman divergences for audio source separation
Time-frequency audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source, and then applying a phase recovery algorithm to retrieve time-domain signals. In particular, the multiple input spectrogram inversion (MISI) algorithm has shown good performance in several recent works. This algorithm minimizes a quadratic reconstruction error between magnitude spectrograms. However, this loss does not properly account for some perceptual properties of audio, and alternative discrepancy measures such as beta-divergences have been preferred in many settings. In this paper, we propose to reformulate phase recovery in audio source separation as a minimization problem involving Bregman divergences. To optimize the resulting objective, we derive a projected gradient descent algorithm. Experiments conducted on a speech enhancement task show that this approach outperforms MISI for several alternative losses, which highlights their relevance for audio source separation applications.